Smooth Value and Policy Functions for Discounted Dynamic Programming
Author
Abstract
We consider a discounted dynamic program in which the spaces of states and actions are smooth manifolds (in a sense suitable for the problem at hand). We give conditions ensuring that the optimal policy and the value function are smooth functions of the state when the discount factor is small. In addition, these functions vary in a Lipschitz manner as the reward function-discount factor pair varies in a neighborhood of the pair consisting of a given reward function and a discount factor of zero. Running Title: Smooth Value and Policy Functions
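For orientation, a minimal sketch of the standard discounted dynamic programming recursion behind this setting; the symbols (state space X, feasible-action correspondence A(x), transition map f, reward r, discount factor β, optimal value V_β, optimal policy π_β) are our own shorthand and are not notation taken from the paper:

\[
V_\beta(x) \;=\; \sup_{a \in A(x)} \bigl\{ r(x,a) + \beta\, V_\beta(f(x,a)) \bigr\},
\qquad
\pi_\beta(x) \;\in\; \operatorname*{arg\,max}_{a \in A(x)} \bigl\{ r(x,a) + \beta\, V_\beta(f(x,a)) \bigr\}.
\]

In this shorthand, the result concerns smoothness of x ↦ V_β(x) and x ↦ π_β(x) in the state, and Lipschitz dependence of both on the pair (r, β) near β = 0.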
Similar resources
The value iteration algorithm is not strongly polynomial for discounted dynamic programming
This note provides a simple example demonstrating that, if exact computations are allowed, the number of iterations required for the value iteration algorithm to find an optimal policy for discounted dynamic programming problems may grow arbitrarily quickly with the size of the problem. In particular, the number of iterations can be exponential in the number of actions. Thus, unlike policy iter...
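As a concrete reference point, a minimal value iteration sketch for a finite discounted MDP; the arrays P and R, the discount factor, and the stopping tolerance are illustrative assumptions, not the example constructed in the note above:

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=100_000):
    """Value iteration for a finite discounted MDP.

    P: transition probabilities, shape (n_actions, n_states, n_states).
    R: immediate rewards, shape (n_actions, n_states).
    Returns the (approximate) optimal value function and a greedy policy.
    """
    n_actions, n_states, _ = P.shape
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # Q-values: q[a, s] = R[a, s] + gamma * sum_s' P[a, s, s'] * v[s']
        q = R + gamma * (P @ v)
        v_new = q.max(axis=0)
        if np.max(np.abs(v_new - v)) < tol:
            v = v_new
            break
        v = v_new
    # Greedy policy with respect to the final value estimate.
    q = R + gamma * (P @ v)
    return v, q.argmax(axis=0)

# Tiny 2-state, 2-action example (numbers are arbitrary).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.5, 2.0]])
v, policy = value_iteration(P, R, gamma=0.95)
print(v, policy)
```

The sketch stops on a sup-norm tolerance; the point of the note above is that, with exact arithmetic and an exact optimality criterion in place of such a tolerance, the number of iterations needed to pin down an optimal policy can grow very quickly with problem size.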
An Optimal Tax Relief Policy with Aligning Markov Chain and Dynamic Programming Approach
In this paper, a Markov chain and dynamic programming were used to derive a suitable pattern for tax relief and for reducing tax evasion, based on tax revenues in Iran from 2005 to 2009. Applying this model showed that tax evasion amounted to 6714 billion Rials. With 4% relief to taxpayers, and after computing the present value of the received tax, this figure was reduced to 3108 billion Rials. ...
An Application of Discounted Residual Income for Capital Assets Pricing by Method Curve Fitting with Sinusoidal Functions
The basic model for the valuation of a firm is the Dividend Discount Model (DDM). When investors buy stocks, they expect to receive two types of cash flow: dividends during the period in which the stock is held, and the expected sale price at the end of the period. In the extreme case, the investor keeps the stock until the company is liquidated; in such a case, the liquidating dividend becomes t...
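For reference, the textbook dividend discount valuation that this description alludes to; the horizon T, required return r, dividends D_t, and terminal price P_T are generic symbols rather than the paper's notation:

\[
P_0 \;=\; \sum_{t=1}^{T} \frac{D_t}{(1+r)^t} \;+\; \frac{P_T}{(1+r)^T},
\]

and in the liquidation case described above the terminal price P_T is replaced by the liquidating dividend paid at time T.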
A Version of the Euler Equation in Discounted Markov Decision Processes
This paper deals with Markov decision processes (MDPs) on Euclidean spaces with an infinite horizon. One approach to studying this kind of MDP is the dynamic programming (DP) technique, in which the optimal value function is characterized through the value iteration functions. The paper provides conditions that guarantee the convergence of maximizers of the value iteration functions to the optimal ...
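In symbols, the value iteration functions mentioned here are usually defined by a recursion of the following form; the notation (admissible actions A(x), transition kernel Q, reward r, discount factor α) is assumed for illustration, not quoted from the paper:

\[
v_0 \equiv 0,
\qquad
v_{n+1}(x) \;=\; \max_{a \in A(x)} \Bigl\{ r(x,a) + \alpha \int v_n(y)\, Q(dy \mid x, a) \Bigr\},
\]

and the maximizers referred to above are the arg max selections on the right-hand side, whose convergence to an optimal policy is what the stated conditions guarantee.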
Optimal Policies for a Capacitated Two-Echelon Inventory System
This paper demonstrates optimal policies for capacitated serial multiechelon production/inventory systems. Extending the Clark and Scarf (1960) model to include installations with production capacity limits, we demonstrate that a modified echelon base-stock policy is optimal in a two-stage system when there is a smaller capacity at the downstream facility. This is shown by decomposing the dynam...
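A heavily simplified sketch of how a capacity-capped ("modified") echelon base-stock ordering rule acts at a single stage; the function, its parameters, and the single-stage view are illustrative assumptions and not the decomposition used in the paper:

```python
def modified_echelon_base_stock_order(echelon_inventory_position: float,
                                      base_stock_level: float,
                                      capacity: float) -> float:
    """Order up to the echelon base-stock level, capped by production capacity.

    echelon_inventory_position: stock on hand plus in transit at this stage
        and all downstream stages, minus backorders.
    """
    desired = max(base_stock_level - echelon_inventory_position, 0.0)
    return min(desired, capacity)

# Example: echelon position 40, target level 100, per-period capacity 25 -> order 25.
print(modified_echelon_base_stock_order(40.0, 100.0, 25.0))
```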
Journal:
Volume, Issue:
Pages: -
Year of publication: 2015